fix(webhook): merge dataset annotations in multi-round FUSE injection by Sukuna0007Abhi · Pull Request #5974 · fluid-cloudnative/fluid

Sukuna0007Abhi · 2026-06-10T18:17:26Z

Ⅰ. Describe what this PR does

This PR fixes an edge case in the DatasetUsageInjector webhook where it was completely overwriting the fluid.io/dataset-in-use annotation instead of merging it.

I noticed there was a // TODO comment left in the codebase mentioning that multi-round sidecar injection wasn't fully supported yet. So, if a Pod got mutated in multiple rounds (for example, injecting different datasets sequentially), the webhook would wipe out the datasets from earlier rounds and only keep the last one.

To fix this, I updated the logic to check if the annotation already exists. If it does, we parse the existing comma-separated string, merge it with the new datasets using a Go map to avoid duplicates, sort them, and safely update the annotation.

Ⅱ. Does this pull request fix one issue?

fixes #5973

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

I updated the Ginkgo unit tests in pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go. Specifically, I modified the Context("when pod has different annotation value") block to explicitly verify that the old dataset and the new dataset are correctly merged into a single, sorted comma-separated string, rather than being overwritten.

Ⅳ. Describe how to verify it

You can verify this by running the standard webhook test suite: go test -v ./pkg/webhook/plugins/datasetusageinjector/...

To manually verify, you can create a Pod that requests multiple datasets which trigger sequential mutations, and then check the Pod's annotations using kubectl get pod -o yaml. The fluid.io/dataset-in-use annotation will now contain all the injected datasets.

Ⅴ. Special notes for reviews

The string manipulation and deduplication logic was kept as lightweight as possible to avoid adding any overhead to the admission controller's critical path.

Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>

fluid-e2e-bot · 2026-06-10T18:17:55Z

Hi @Sukuna0007Abhi. Thanks for your PR.

I'm waiting for a fluid-cloudnative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

gemini-code-assist

Code Review

This pull request updates the DatasetUsageInjector to support multi-round injection by merging existing dataset usage annotations with new ones instead of overwriting them. The review feedback points out a potential issue where malformed or empty values in existing annotations could lead to empty strings in the final annotation list, and suggests filtering out empty strings after trimming.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Sukuna0007Abhi · 2026-06-10T18:19:35Z

pls review @cheyang @TrafalgarZZZ @xliuqq

cheyang

Thanks for picking this up, @Sukuna0007Abhi — nice first contribution!

The approach looks solid: reading the existing annotation, deduplicating via a map, and writing back the sorted union. A few observations:

What looks good:

Nil-map access on pod.Annotations is safe in Go (returns zero value), so the read path is correct even when annotations haven't been initialized yet.
The len(existingVal) > 0 guard prevents splitting an empty string into [""].
Deduplication + sorted output ensures idempotent behavior across rounds — nice.
The TODO comment removal is appropriate now that the issue is addressed.

Minor nit (non-blocking):
When parsing the existing annotation value, strings.TrimSpace(ds) could produce an empty string if the stored value is somehow malformed (e.g., trailing comma: "a,b,"). Adding a quick filter would make it fully defensive:

for _, ds := range existingDatasets {
    trimmed := strings.TrimSpace(ds)
    if trimmed != "" {
        datasetsInUseMap[trimmed] = struct{}{}
    }
}

In practice this is unlikely since this code is the only writer of the annotation, but it's a cheap guard. Up to you whether to address it — won't block on this.

Tests: The updated test correctly validates the merge scenario. If you're feeling ambitious, a test for duplicate detection (same dataset already in annotation + in current runtimeInfos) would round things out, but what's here is adequate.

Overall this is a clean, well-scoped fix. LGTM from a code-correctness standpoint.

/ok-to-test

codecov · 2026-06-11T02:41:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.57%. Comparing base (dae5534) to head (7931adb).
⚠️ Report is 10 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5974      +/-   ##
==========================================
+ Coverage   63.54%   63.57%   +0.03%     
==========================================
  Files         478      479       +1     
  Lines       33217    33290      +73     
==========================================
+ Hits        21107    21165      +58     
- Misses      10432    10445      +13     
- Partials     1678     1680       +2

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sukuna0007Abhi · 2026-06-11T03:18:12Z

Thanks for picking this up, @Sukuna0007Abhi — nice first contribution!

The approach looks solid: reading the existing annotation, deduplicating via a map, and writing back the sorted union. A few observations:

What looks good:

Nil-map access on pod.Annotations is safe in Go (returns zero value), so the read path is correct even when annotations haven't been initialized yet.

The len(existingVal) > 0 guard prevents splitting an empty string into [""].

Deduplication + sorted output ensures idempotent behavior across rounds — nice.

The TODO comment removal is appropriate now that the issue is addressed.

Minor nit (non-blocking): When parsing the existing annotation value, strings.TrimSpace(ds) could produce an empty string if the stored value is somehow malformed (e.g., trailing comma: "a,b,"). Adding a quick filter would make it fully defensive:
for _, ds := range existingDatasets {
    trimmed := strings.TrimSpace(ds)
    if trimmed != "" {
        datasetsInUseMap[trimmed] = struct{}{}
    }
}
In practice this is unlikely since this code is the only writer of the annotation, but it's a cheap guard. Up to you whether to address it — won't block on this.

Tests: The updated test correctly validates the merge scenario. If you're feeling ambitious, a test for duplicate detection (same dataset already in annotation + in current runtimeInfos) would round things out, but what's here is adequate.

Overall this is a clean, well-scoped fix. LGTM from a code-correctness standpoint.

/ok-to-test

Thanks for the review, thanks for approving and merging 🙌

Sukuna0007Abhi · 2026-06-11T06:40:00Z

Pls approve @cheyang @TrafalgarZZZ

Copilot

Pull request overview

Fixes the DatasetUsageInjector mutating webhook so multi-round FUSE sidecar injection merges into the existing fluid.io/datasets-in-use annotation instead of overwriting it, preserving dataset usage tracking across sequential mutations.

Changes:

Merge existing datasets-in-use annotation value with newly injected dataset names via a deduplicating map and sorted output.
Remove the prior TODO by implementing the previously missing multi-round behavior.
Update the Ginkgo test to assert merge behavior rather than overwrite behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go	Implements merge/dedupe/sort when writing the datasets-in-use annotation.
pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go	Updates unit test expectations to validate merged annotation output.

Comments suppressed due to low confidence (1)

pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector_test.go:176

The updated test only covers merging when the existing annotation contains a single dataset. Since the new logic also parses comma-separated values, trims whitespace, and deduplicates via a map, add coverage for (1) existing annotation with multiple entries / whitespace and (2) a duplicate dataset across rounds to ensure the merge logic behaves as intended.

			It("should merge the existing annotation with the new one", func() {
				pod := &corev1.Pod{
					ObjectMeta: metav1.ObjectMeta{
						Name:      "test-update",
						Namespace: "fluid-test",

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

cheyang · 2026-06-13T06:42:39Z

Summary: Clean, well-scoped fix that correctly addresses the multi-round FUSE injection annotation overwrite bug.

The approach is sound: read existing annotation values into a map for deduplication, merge new datasets, sort, and write back. Key observations:

Nil-safety is handled correctly. Reading from a nil pod.Annotations map in Go returns the zero value, and the len(pod.Annotations) == 0 guard before writing ensures no nil-map panic.
The len(existingVal) > 0 check correctly avoids splitting an empty string into [""].
Sorted, deduplicated output ensures idempotent behavior across injection rounds.
TODO comment removal is appropriate now that the issue is resolved.

Non-blocking suggestion: Filter out empty strings after TrimSpace to guard against malformed annotation values (trailing commas, etc.). This is defensive since this code is the sole writer, but it is a one-line addition.

Test coverage note: The updated test validates the merge scenario well. A test for partial-overlap deduplication (annotation has datasets A and B, runtimeInfos has B and C, expect A,B,C) would round things out but is not required.

No blocking issues found. The fix is correct and ready for merge.

Review by fluid-cloudnative/fluid agent reviewer

…partial overlap test Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>

sonarqubecloud · 2026-06-14T07:53:27Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
38.0% Duplication on New Code

See analysis details on SonarQube Cloud

TrafalgarZZZ

/lgtm

TrafalgarZZZ · 2026-06-15T11:40:08Z

@Sukuna0007Abhi Thanks for fixing this! There's a lint failure in current branch. Could you help also fix that? And we can merge it.

cheyang

/lgtm
/approve

fluid-e2e-bot · 2026-06-15T16:33:15Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang, TrafalgarZZZ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [TrafalgarZZZ,cheyang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sukuna0007Abhi · 2026-06-15T18:33:13Z

Thank you @cheyang @TrafalgarZZZ

@Sukuna0007Abhi Thanks for fixing this! There's a lint failure in current branch. Could you help also fix that? And we can merge it.

Sorry I was bit offline in that time, but it's merged now thanks, I think the lint got fixed or if not, I will raise another pr,

Thanks again @cheyang @TrafalgarZZZ

fix(webhook): merge dataset annotations in multi-round FUSE injection

1c146bd

Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>

fluid-e2e-bot Bot added the needs-ok-to-test label Jun 10, 2026

gemini-code-assist Bot reviewed Jun 10, 2026

View reviewed changes

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go

cheyang reviewed Jun 11, 2026

View reviewed changes

fluid-e2e-bot Bot added ok-to-test and removed needs-ok-to-test labels Jun 11, 2026

cheyang requested review from TrafalgarZZZ and Copilot June 11, 2026 06:49

Copilot started reviewing on behalf of cheyang June 11, 2026 06:49 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go

cheyang reviewed Jun 13, 2026

View reviewed changes

Comment thread pkg/webhook/plugins/datasetusageinjector/dataset_usage_injector.go

fix(webhook): filter empty strings in dataset usage injector and add …

7931adb

…partial overlap test Signed-off-by: Sukuna0007Abhi <appsonly310@gmail.com>

Sukuna0007Abhi requested a review from cheyang June 14, 2026 07:53

TrafalgarZZZ approved these changes Jun 15, 2026

View reviewed changes

fluid-e2e-bot Bot assigned TrafalgarZZZ Jun 15, 2026

fluid-e2e-bot Bot added lgtm approved labels Jun 15, 2026

cheyang approved these changes Jun 15, 2026

View reviewed changes

fluid-e2e-bot Bot assigned cheyang Jun 15, 2026

cheyang merged commit 4601594 into fluid-cloudnative:master Jun 15, 2026
25 of 28 checks passed

Conversation

Sukuna0007Abhi commented Jun 10, 2026

Ⅰ. Describe what this PR does

Ⅱ. Does this pull request fix one issue?

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

Uh oh!

fluid-e2e-bot Bot commented Jun 10, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Sukuna0007Abhi commented Jun 10, 2026

Uh oh!

cheyang left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Sukuna0007Abhi commented Jun 11, 2026

Uh oh!

Sukuna0007Abhi commented Jun 11, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

cheyang commented Jun 13, 2026

Uh oh!

Uh oh!

sonarqubecloud Bot commented Jun 14, 2026

Quality Gate passed

Uh oh!

TrafalgarZZZ left a comment

Choose a reason for hiding this comment

Uh oh!

TrafalgarZZZ commented Jun 15, 2026

Uh oh!

cheyang left a comment

Choose a reason for hiding this comment

Uh oh!

fluid-e2e-bot Bot commented Jun 15, 2026

Uh oh!

Uh oh!

Sukuna0007Abhi commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov Bot commented Jun 11, 2026 •

edited

Loading